LIBMF: A Library for Parallel Matrix Factorization in Shared-memory Systems

Authors

  • Wei-Sheng Chin
  • Bo-Wen Yuan
  • Meng-Yuan Yang
  • Yong Zhuang
  • Yu-Chin Juan
  • Chih-Jen Lin
Abstract

Matrix factorization (MF) plays a key role in many applications such as recommender systems and computer vision, but MF can take a long time to run on the large matrices common in the big-data era. Many parallel techniques have been proposed to reduce the running time, but few parallel MF packages are available. We therefore present an open-source library, LIBMF, based on recent advances in parallel MF for shared-memory systems. LIBMF includes easy-to-use command-line tools, interfaces to C/C++ languages, and comprehensive documentation. Our experiments demonstrate that LIBMF outperforms state-of-the-art packages. LIBMF is BSD-licensed, so users can freely use, modify, and redistribute the code.
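
For readers unfamiliar with the problem the abstract describes, the following is a minimal single-threaded sketch of matrix factorization fitted by stochastic gradient descent. It is not LIBMF's code or API: the function names (`factorize`, `rmse`), the hyperparameters, and the toy rating data are all assumptions chosen for illustration, and the parallel scheduling that the paper is about is deliberately left out.

```python
import random


def factorize(ratings, n_rows, n_cols, k=2, lr=0.02, reg=0.01,
              epochs=1000, seed=0):
    """Approximate a sparse matrix R by P * Q^T using plain SGD.

    ratings: list of (row, col, value) triples for the observed entries.
    Returns (P, Q) as nested lists of shape n_rows x k and n_cols x k.
    All names and defaults are illustrative, not LIBMF's.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_cols)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit observed entries in random order
        for u, v, r in samples:
            pred = sum(P[u][f] * Q[v][f] for f in range(k))
            err = r - pred
            for f in range(k):
                pu, qv = P[u][f], Q[v][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qv - reg * pu)
                Q[v][f] += lr * (err * pu - reg * qv)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error of P * Q^T on the observed entries."""
    total = 0.0
    for u, v, r in ratings:
        pred = sum(pu * qv for pu, qv in zip(P[u], Q[v]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5


# Hypothetical toy data: (user, item, rating) for a 5 x 4 matrix.
TOY_RATINGS = [
    (0, 0, 5), (0, 1, 3), (0, 3, 1),
    (1, 0, 4), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 3, 5),
    (3, 0, 1), (3, 3, 4),
    (4, 1, 1), (4, 2, 5), (4, 3, 4),
]
```

Calling `factorize(TOY_RATINGS, 5, 4)` and then `rmse(TOY_RATINGS, P, Q)` gives a quick check that the factors fit the observed entries. Each update touches only one row of `P` and one row of `Q`, which is why updates on entries that share no row or column are independent of each other.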

Similar resources

Nonnegative Matrix Factorization via Newton Iteration for Shared-memory Systems

Nonnegative Matrix Factorization (NMF) can be used to approximate a large nonnegative matrix as a product of two smaller nonnegative matrices. This paper shows in detail how an NMF algorithm based on Newton iteration can be derived utilizing the general Karush-Kuhn-Tucker (KKT) conditions for first-order optimality. This algorithm is suited for parallel execution on shared-memory systems. It was...

Solving linear systems with vectorized WZ factorization

In the paper, we present a vectorized algorithm for WZ factorization of a matrix, implemented with the BLAS1 library. We present the results of numerical experiments which show that vectorization accelerates the sequential WZ factorization. Next, we parallelized both algorithms for a two-processor shared-memory machine using the OpenMP standard. We present performances of these...

Tall and Skinny QR Matrix Factorization Using Tile Algorithms on Multicore Architectures

To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist in scheduling a Directed Acyclic Graph (DAG) of tasks of fine granularity where nodes represent tasks, either panel factorization or update of a block-column, and edges represent dependencies among them. Although past approaches already achieve high performance on ...

Developing a High Performance Software Library with MPI and CUDA for Matrix Computations

Nowadays, the paradigm of parallel computing is changing. CUDA is now a popular programming model for general-purpose computations on GPUs, and a great number of applications have been ported to CUDA, obtaining speedups of orders of magnitude compared to optimized CPU implementations. Hybrid approaches that combine the message passing model with the shared memory model for parallel computing are a so...

DuctTeip: A Task-Based Parallel Programming Framework for Distributed Memory Architectures

Current high-performance computer systems used for scientific computing typically combine shared memory compute nodes in a distributed memory environment. Extracting high performance from these complex systems requires tailored approaches. Task based parallel programming has been successful both in simplifying the programming and in exploiting the available hardware parallelism. We have previou...

Journal:
  • Journal of Machine Learning Research

Volume 17  Issue

Pages  -

Publication date 2016